I. REAL PARTY IN INTEREST

The real party in interest for this appeal and the present application is Xerox 
Corporation, by way of an Assignment recorded in the U.S. Patent and Trademark Office at 
Reel 14259, Frame 54-56 and Reel 014540, Frame 0458-0459.

II. STATEMENT OF RELATED APPEALS AND INTERFERENCES

Following are identified any prior or pending appeals, interferences or judicial
proceedings, known to Appellant, Appellant's representative, or the Assignee, that may
be related to, or which will directly affect or be directly affected by or have a bearing
upon the Board's decision in the pending appeal:

Appeal Brief to be filed in copending Application No. 10/608,587 
Appeal Brief to be filed in copending Application No. 10/608,591 

There are no further prior or pending appeals, interferences or judicial 
proceedings, known to Appellant, Appellant's representative, or the Assignee, that may 
be related to, or which will directly affect or be directly affected by or have a bearing 
upon the Board's decision in the pending appeal.

III. STATUS OF CLAIMS

Claims 1-48 are on appeal. 

Claims 1-38 and 45-48 are rejected, and claims 39-44 are objected to only for 
being dependent from a rejected base claim, but are otherwise allowable.

IV. STATUS OF AMENDMENTS

No Amendment After Final Rejection has been filed.

V. SUMMARY OF CLAIMED SUBJECT MATTER

The subject matter of independent claims 1, 26, and 37 is directed to a
methodology for assembling a document from content spanning multiple web-pages by 
employing two cooperative processes. Given a starting location 110, one process 
analyzes a single page at a time to find candidate links 140. The links are recursively 
followed and those pages are analyzed. A detailed set of heuristics is used to 
determine what is or is not a candidate link. The candidate pages 120 are then fed to a 
document-level analyzer 150. This process compares the attributes of one page 
against the others and looks for a document-like structure. Using another detailed set 
of heuristics, the document-level analyzer 150 determines if the page should be 
included in the document (see Abstract, page 21 of the specification as filed, and
Figure 1) [in support of claims 1, 26, and 37].

FIG. 1

In more particular support of the subject matter of independent claims 26 and
37, the page-level link analysis 140 is described in greater detail in Figure 2. During 
page-level link analysis 140, the document detection system attempts to identify links 
that may potentially lead to other pages within the same document. It is assumed that a 
well-authored multi-page document will always include progression links (links that 
provide some well-defined progression through the document, often indicated by the
presence of some well-known contextual clue, such as a graphic or text "next" or
"previous" indicator) and/or table of contents links (clusters of links providing a path to 
every page or some logical subset of pages in the document) that indicate the structure 
of the document. These are the two categories of intra-document links that the link 
analysis process 140 seeks to identify (see page 7, lines 5-15 of the specification as
filed, and Figure 2) [in support of claims 26 and 37].

FIG. 2

In more particular support of the subject matter of independent claim 37, the link 
analysis process begins with the retrieval of the actual page 270 for analysis from the 
page identifier 110. This is done, as will be well understood by those skilled in the art,
by the page retrieval process 260. The retrieved page 270 is then used as input to both 
the progression-link identification module 210 and the link-cluster identification module 
220. In the progression-link identification module 210, possible progression links 230 
are identified primarily by means of a progression indicator, which is a textual or 
graphical clue that suggests the nature of the progression link. Link-cluster
identification module 220 examines the page data 270 to identify link clusters and
thereby possible table of content type links 240. The possible progression links 230 and 
possible table of content links 240 are passed to module 250 for a final examination to 
weed out links which have properties that are not characteristic of typical
intra-document links, e.g., they point to a different web server. The final result is then a list of
intra-document links 120 for the candidate page 270 (see page 7, lines 17-30 of the
specification as filed, and Figure 2) [in support of claim 37].
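
By way of illustration only, and not as a characterization of any actual implementation, the page-level flow summarized above (identify possible progression links by a textual clue, then weed out links pointing to a different web server) might be sketched as follows. The function names, the clue-word set, and the use of the URL host as the "same web server" test are assumptions made for this example; the specification names only "next" and "previous" as example clues, and link-cluster identification (module 220) is omitted.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical clue words; the specification gives "next" and "previous" as examples.
PROGRESSION_CLUES = {"next", "previous"}


def possible_progression_links(links):
    """Return the hrefs whose anchor text contains a progression clue word.

    `links` is a list of (href, anchor_text) pairs taken from one page.
    """
    return [
        href
        for href, text in links
        if any(word in PROGRESSION_CLUES for word in text.lower().split())
    ]


def weed_out(page_url, hrefs):
    """Keep only links that resolve to the same host as the page itself.

    Links pointing to a different web server are discarded, not merely flagged.
    """
    host = urlparse(page_url).netloc
    kept = []
    for href in hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links against the page
        if urlparse(absolute).netloc == host:
            kept.append(absolute)
    return kept


page = "http://example.com/doc/page1.html"
links = [
    ("page2.html", "Next"),
    ("http://other.example.org/x.html", "next page"),
    ("about.html", "About us"),
]
candidates = possible_progression_links(links)
intra_document = weed_out(page, candidates)
```

In this sketch the link to the other host passes the clue test but is then removed from the result, which is the sense of "weed out" relied upon in the argument below.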

VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL

The following grounds of rejection are presented for review:

Claims 1-6, 10, 12-18, 26-30, 35-38, and 46-48 are rejected under 35 U.S.C.
§103(a) as being unpatentable over U.S. Patent No. 6,112,203 to Bharat et al.
(hereinafter Bharat) in further view of U.S. Patent No. 5,924,104 to Earl (hereinafter
Earl).

VII. ARGUMENT

A. Claims 1-6, 10, 12-18, 26-30, 35-38, and 46-48 Would Not Have Been
Obvious Over Bharat in View of Earl

Claims 1-6, 10, 12-18, 26-30, 35-38, and 46-48 are rejected under 35
U.S.C. §103(a) as being unpatentable over Bharat in further view of Earl.

Problematically, neither Bharat nor Earl, alone or in combination, teaches or
suggests the Applicants' invention. Claim elements are missing from the Bharat
and Earl references. Indeed, the references teach away from the Applicants'
claimed invention. A prima facie case of obviousness has thus not been made
out. Further, no finding has been provided directed to an identifiable reason that
would have prompted a person of ordinary skill in the relevant field to combine
the elements in the way that the Applicants' claimed new invention does. Thus,
the Applicant is faced with the conundrum of positively proving a negative; that
is, proving that something which is not there is not there.

Bharat teaches that in a computerized method, a set of documents is 
ranked according to their content and their connectivity by using topic distillation. 
The documents include links that connect the documents to each other, either 
directly, or indirectly. A graph is constructed in a memory of a computer system. 
In the graph, nodes represent the documents, and directed edges represent the 
links. Based on the number of links connecting the various nodes, a subset of 
documents is selected to form a topic. A second subset of the documents is 
chosen based on the number of directed edges connecting the nodes. Nodes in
the second subset are compared with the topic to determine similarity to the
topic, and a relevance weight is correspondingly assigned to each node. Nodes 
in the second subset having a relevance weight less than a predetermined 
threshold are pruned from the graph. The documents represented by the 
remaining nodes in the graph are ranked by a connectivity-based ranking scheme.

It is essential to the understanding of Bharat that Bharat is directed to a
search engine and as such is sorting through pages already identified by a
simple word string search (please see column 1, lines 14-54 of Bharat). Bharat
is concerned with solving the problem of answering a search engine query, and
thus with ranking a set of documents to point to in response to that query. The
Applicants, however, teach how, having identified where one document
page is, to find and pull together all relevant pages associated with that
document into a single coherent document (please see page 5, first paragraph,
of the Applicants' specification), that is, a single coherent document
representation suitable for printing and downloaded viewing. As such, the
Applicants teach "to weed out links which have properties that are not
characteristic of intra-document links" [a claim element of all three independent
claims 1, 26 & 37] and thus eschew all other documents. Bharat, on the other
hand, will not link (i.e., Bharat will reject) self-referencing pages so as not to
unduly influence the search outcome (see column 5, lines 17-20), where Bharat
provides:

"If a link points to a page that is represented by a node in the graph, and
both pages are on different servers, then a corresponding edge 213 is
added to the graph 211. Nodes representing pages on the same server
are not linked. This prevents a single Web site with many self-referencing
pages to unduly influence the outcome. This completes
the n-graph 211."

Thus Bharat is interested only in inter-document links for the sake of ranking
links. Bharat does not assemble a single coherent document but a link list of
search results responsive to a word query. Thus Bharat does NOT examine "the
collective set of identified candidate document pages to weed out links which
have properties that are not characteristic of intra-document links". Thus a claim
element is missing.
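
The contrast drawn above can be made concrete with a short sketch, offered for illustration only. It reduces the quoted Bharat passage to its server test (the further condition that the target page already be a node in the graph is omitted) and compares it with the same-server example given in the Applicants' specification. All function names are hypothetical, and the URL host is assumed to stand in for "server".

```python
from urllib.parse import urlparse


def same_server(url_a, url_b):
    """Assumed test for 'same server': the two URLs share a host."""
    return urlparse(url_a).netloc == urlparse(url_b).netloc


def bharat_style_edges(links):
    """Keep (source, target) pairs on different servers, per the quoted passage."""
    return [(src, dst) for src, dst in links if not same_server(src, dst)]


def intra_document_candidates(links):
    """Keep pairs on the same server, as in the specification's example."""
    return [(src, dst) for src, dst in links if same_server(src, dst)]


links = [
    ("http://a.example/1", "http://a.example/2"),
    ("http://a.example/1", "http://b.example/x"),
]
edges = bharat_style_edges(links)
candidates = intra_document_candidates(links)
```

Under these assumptions the two filters are complements of one another: every link retained by one is discarded by the other.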

Indeed, Bharat teaches away from the Applicants' invention. The 
Applicants teach to embrace that which Bharat discards. The cited text from the 
Applicants' specification page 7, lines 9-22 follows: 

"The link analysis process begins with the retrieval of the actual 
page 270 for analysis from the page identifier 110. This is done as will be 
well understood by those skilled in the art, by the page retrieval process 
260. The retrieved page 270 is then used as input to both the 
progression-link identification module 210 and the link-cluster identification 
module 220. In the progression-link identification module 210, possible 
progression links 230 are identified primarily by means of a progression 
indicator, which is a textual or graphical clue that suggests the nature of 
the progression link. Link-cluster identification module 220 examines the 
page data 270 to identify link clusters and thereby possible table of 
content type links 240. The possible progression links 230 and possible 
table of content links 240 are passed to module 250 for a final examination
to weed out links which have properties that are not characteristic of
typical intra-document links, e.g. they point to a different web server. The 
final result is then a list of intra-document links 120 for the candidate page 
270." 

To paraphrase, the Applicants pass the possible links on for examination to weed out
those links which are not characteristic of typical intra-document links. An
example of the links which are weeded out are those which point to a different
web server (please see also page 10, lines 32-34, of the Application
Specification as filed). These links are not likely to be part of the document the
Applicants are trying to assemble. Bharat does just the opposite, as noted
above, so as to not unduly influence the outcome of the results to the user query.
Please also see the attached §132 Declarations, particularly the Sweet
Declaration, item numbers 7-11, and the Harrington Declaration, item number
7.

Earl fails to provide what Bharat lacks, nor does it provide any teaching 
relating to the Applicants' claimed invention. Earl provides link lists like Bharat 
but provides different presentation styles for the links to a user depending on 
whether they are intra-document or inter-document. Actually what Earl defines 
as intra-document is what the Applicants would call intra-page, i.e. a link pointing 
to a location somewhere further down the same page. And thus what Earl calls 
inter-document is really inter-page. The teaching found in Earl is simply about 
providing some indicia to the viewer as to whether a hyper link will take the user 
elsewhere down the present page or to an entirely different page. 
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Please see the attached §132 Declarations for what one skilled in the art 
would consider to be "intra-document" versus intra-page, particularly in the Sweet 
Declaration, item numbers 13-15, and in the Harrington Declaration, item number 
5. The Applicants are teaching assembling a document, and having identified 
the current page, have no interest in self referential links to that same page (they 
already have it) and thus would discard, or weed out, those links which Earl 
keeps. 

Earl, having made a discrimination between two types of links, keeps all
those links, choosing only to display them differently. The Applicants, having
discriminated between links to find some as not pointing to more of the desired
document, discard or weed out or filter out those links. A gardener, weeding out
a flower bed and having spotted a weed, does not keep that weed in their flower 
bed to display differently. But Earl does. Thus Earl does NOT examine "the 
collective set of identified candidate document pages to weed out links which 
have properties that are not characteristic of typical intra-document links". 

In rebuttal to this previously presented argument above analogizing the 
terminology "weed out" to gardening, the Examiner has asserted that the Office 
"is forced to rely on the knowledge of one of ordinary skill in the art". The 
Applicants must emphatically traverse as to how one skilled in the art would 
interpret this terminology. Please see the attached §132 Declarations, particularly
the Sweet Declaration, item number 8 and especially item 9, for what those skilled
in the art would consider to be meant by the terminology "weed out". Earl only
discriminates, but does not discard; thus Earl does not "weed out".
Therefore, Earl in turn fails to provide the elements that Bharat also lacks,
and thus the combination of Bharat and Earl fails to provide the requirements for a
prima facie case of obviousness and the rejection is improper.

Further, no finding has been provided by the Examiner directed to some 
genuine identifiable reason that would have prompted a person of ordinary skill in 
the relevant field to combine the elements in the way that the Applicants' claimed 
new invention does. The importance of doing so is clearly stated in KSR, 550 
U.S., 82 USPQ2d at 1395 and 1396. It would appear that the Examiner is using 
Appellant's disclosure as a recipe for selecting the appropriate portions of the 
prior art to construct Appellant's claimed invention. A piecemeal reconstruction 
of prior art patents in light of Appellant's disclosure should not be a basis for a 
holding of obviousness, especially when claim elements are absent.

VIII. CONCLUSION

For all of the reasons discussed above, it is respectfully submitted that the
rejections are in error and that claims 1-6, 10, 12-18, 26-30, 35-38, and 46-48 are in
condition for allowance. For all of the above reasons, Appellants respectfully request this
Honorable Board to reverse the rejections of claims 1-6, 10, 12-18, 26-30, 35-38, and
46-48.

Respectfully submitted, 



/Christopher D. Wait, Reg. #43230/ 
Christopher D. Wait 
Attorney for Applicant(s) 
Registration No. 43,230 
Telephone (585) 423-6918 



XEROX CORPORATION 
Xerox Square - 20A 
Rochester, NY 14644 

Filed: October 10, 2008

CLAIMS APPENDIX
CLAIMS INVOLVED IN THE APPEAL: 

1. (Previously Presented) An automated identification methodology for 
assembling a document representation for subsequent viewing or printing of a 
given hyperdocument by gathering related hyperlinked page content comprising: 

performing a page-level link analysis that identifies those intra- 
document hyperlinks on a page linking to a candidate document page; 

performing a recursive application of the page-level link analysis to the 
linked candidate document page and any further nested candidate 
document pages thereby identified, until a collective set of identified 
candidate document pages is assembled; 

performing a document-level analysis that examines the collective set 
of identified candidate document pages for grouping into one or more 
documents; 

examining the collective set of identified candidate document pages to 
weed out from that collective set of identified candidate document pages 
links which have properties that are not characteristic of intra-document 
links, to provide a resultant set of identified candidate document pages; 

grouping the content found in the resultant set of candidate document 
pages by an automated system into a document representation stored in 
memory by the automated system; and, 

printing, or viewing on a display by a user, the document 
representation. 

2. (Original) The method of claim 1 wherein the page-level link analysis 
includes retrieval of referenced pages. 

3. (Original) The method of claim 1 wherein the page-level link analysis 
includes examination of contextual clues. 

4. (Original) The method of claim 3 wherein the contextual clue is a 
particular class of content item associated with the hyperlink. 

5. (Original) The method of claim 4 wherein the class of content item is a 
class of text.

6. (Original) The method of claim 5 wherein the class of text is a
directional word or phrase. 

7. (Original) The method of claim 4 wherein the class of content item is a 
class of image. 

8. (Original) The method of claim 7 wherein the class of image is an 
image containing a directional symbol. 

9. (Previously Presented) The method of claim 7 wherein a textual clue is 
obtained for the class of image. 

10. (Original) The method of claim 1 wherein the page-level link analysis 
includes the identification of progression links. 

11. (Previously Presented) The method of claim 3 wherein the contextual 
clue is the presence of at least one other hyperlink nearby with the candidate 
document page. 

12. (Previously Presented) The method of claim 3 wherein the contextual 
clue is the similarity of the hyperlink destination to that of other hyperlinks within 
the hyperdocument. 

13. (Original) The method of claim 1 wherein the page-level link analysis 
includes the identification of tables of contents. 

14. (Original) The method of claim 1 wherein the document-level analysis 
includes the identification of pages forming a chain of progression links. 

15. (Original) The method of claim 1 wherein the document-level analysis 
includes identifying the pages listed in a table of contents. 

16. (Original) The method of claim 1 wherein the document-level analysis 
includes identifying as part of the document the page containing the table of 
contents.

17. (Original) The method of claim 1 wherein the document-level analysis
includes the similarity of candidate pages. 

18. (Original) The method of claim 17 wherein the similarity includes the 
location at which the page is stored. 

19. (Original) The method of claim 17 wherein the similarity includes the 
similarity of meta-data associated with the page. 

20. (Original) The method of claim 19 wherein the meta-data includes the 
author identification. 

21. (Original) The method of claim 17 wherein the similarity includes similar
style specifications. 

22. (Original) The method of claim 17 wherein the similarity includes similar 
page layout. 

23. (Original) The method of claim 17 wherein the similarity includes similar 
logical structure of the page content. 

24. (Original) The method of claim 17 wherein the similarity includes the 
presence of at least one similar content item on each page. 

25. (Original) The method of claim 1 wherein the document-level analysis 
includes analysis of the topological structure of the linked pages.

26. (Previously Presented) A system identification methodology for
assembling a representation for subsequent viewing or printing of a given 
hyperdocument by gathering related hyperlinked document content comprising: 

performing a page-level link analysis that identifies those hyperlinks on 
a page linking to a candidate document page further comprising a 
methodology of: 

identifying possible progression links, and; 
identifying possible table of content links; 

performing a recursive application of the page-level link analysis to the 
linked candidate document page and any further nested candidate 
document pages thereby identified, until a collective set of identified 
candidate document pages is assembled; 

performing a document-level analysis that examines the collective set 
of identified candidate document pages for grouping into one or more 
documents 

examining the collective set of identified candidate document pages to 
weed out links which have properties that are not characteristic of intra- 
document links, to provide a resultant set of identified candidate document 
pages; 

grouping the content found in the resultant set of candidate document 
pages by an automated system into a document representation stored in 
memory by the automated system; and, 

printing, or viewing on a display by a user the given document 
representation. 

27. (Original) The method of claim 26 wherein the page-level link analysis 
includes examination of contextual clues. 

28. (Original) The method of claim 27 wherein the contextual clue is a 
particular class of content item associated with the hyperlink. 

29. (Original) The method of claim 28 wherein the class of content item is a 
class of text. 

30. (Original) The method of claim 29 wherein the class of text is a 
directional word or phrase.

31. (Original) The method of claim 28 wherein the class of content item is a
class of image. 

32. (Original) The method of claim 31 wherein the class of image is an 
image containing a directional symbol. 

33. (Previously Presented) The method of claim 31 wherein a textual clue 
is obtained for the class of image. 

34. (Previously Presented) The method of claim 27 wherein the contextual 
clue is the presence of at least one other hyperlink nearby with the candidate 
document page. 

35. (Previously Presented) The method of claim 27 wherein the contextual 
clue is the similarity of the hyperlink destination to that of other hyperlinks within 
the hyperdocument. 

36. (Original) The method of claim 26 wherein the document-level analysis 
includes the identification of pages forming a chain of progression links.

37. (Previously Presented) A system identification methodology for
assembling a document representation for subsequent viewing or printing of a 
given hyperlinked document by gathering related hyperlinked page content 
comprising: 

performing a page-level link analysis that identifies those hyperlinks on 
a page linking to a candidate document page further comprising a methodology of: 
identifying possible progression links; 
identifying possible table of content links, and; 
examining the possible progression links and the possible table of 
content links for common characteristics; 

performing a recursive application of the page-level link analysis to the 
linked candidate document page and any further nested candidate document 
pages thereby identified, until a collective set of identified candidate document 
pages is assembled; 

performing a document-level analysis that examines the collective set 
of identified candidate document pages for grouping into one or more documents; 

examining the collective set of identified candidate document pages to 
weed out links which have properties that are not characteristic of intra-document 
links, to provide a resultant set of identified candidate document pages; and, 

grouping the content found in the resultant set of candidate document 
pages by an automated system into a document representation stored in memory 
by the automated system; and, 

printing, or viewing on a display by a user the document representation. 

38. (Original) The method of claim 37 wherein the page-level link analysis 
includes examination of contextual clues. 

39. (Original) The method of claim 38 wherein the contextual clue is a 
particular class of content item associated with the hyperlink. 

40. (Original) The method of claim 39 wherein the class of content item is a 
class of text. 

41. (Original) The method of claim 40 wherein the class of text is a 
directional word or phrase.

42. (Original) The method of claim 39 wherein the class of content item is a
class of image. 

43. (Original) The method of claim 42 wherein the class of image is an 
image containing a directional symbol. 

44. (Previously Presented) The method of claim 42 wherein a textual clue 
is obtained for the class of image. 

45. (Previously Presented) The method of claim 38 wherein the contextual 
clue is the presence of at least one other hyperlink nearby with the candidate 
document page. 

46. (Previously Presented) The method of claim 38 wherein the contextual 
clue is the similarity of the hyperlink destination to that of other hyperlinks within 
the hyperdocument. 

47. (Original) The method of claim 37 wherein the document-level analysis 
includes the identification of pages forming a chain of progression links. 

48. (Original) The method of claim 37 wherein the document-level analysis 
includes the identification of pages linked to by the same tables of contents.

EVIDENCE APPENDIX

A copy of each of the following items of evidence relied on by the Appellant is attached: 

DECLARATION UNDER 37 CFR §1.132 by Steven J. Harrington, Ph.D., filed 10/12/07.
The evidence was entered into the record by the Examiner in the 12/26/07 Office Action.

DECLARATION UNDER 37 CFR §1.132 by James M. Sweet, filed 10/12/07.
The evidence was entered into the record by the Examiner in the 12/26/07 Office Action.

RELATED PROCEEDINGS APPENDIX